A Syntactic Valency Lexicon for Persian Verbs: The First Steps towards Persian Dependency Treebank
نویسندگان
چکیده
Valency lexicons are valuable resources for natural language processing. The need for new resources for languages encourages researchers to collect new datasets. One of the most important datasets is valency lexicons. In valency lexicons, information about obligatory and optional complements of words is annotated at the syntactic and semantic levels. In this paper, we report the development of the first syntactic valency lexicon of Persian verbs. This lexicon is part of the Persian Dependency Treebank Project. The lexicon consists of 4282 distinct verb lemmas and 5429 distinct verb-valency pairs.
منابع مشابه
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
متن کاملExtending the coverage of a MWE database for Persian CPs exploiting valency alternations
PersPred is a manually elaborated multilingual syntactic and semantic Lexicon for Persian Complex Predicates (CPs), referred to also as “Light Verb Constructions” (LVCs) or “Compound Verbs”. CPs constitutes the regular and the most common way of expressing verbal concepts in Persian, which has only around 200 simplex verbs. CPs can be defined as multi-word sequences formed by a verb and a non-v...
متن کاملPersian Proposition Bank
This paper describes the procedure of semantic role labeling and the development of the first manually annotated Persian Proposition Bank (PerPB) which added a layer of predicate-argument information to the syntactic structures of Persian Dependency Treebank (known as PerDT). Through the process of annotating, the annotators could see the syntactic information of all the sentences and so they a...
متن کاملUniversal Dependencies for Persian
The Persian Universal Dependency Treebank (Persian UD) is a recent effort of treebanking Persian with Universal Dependencies (UD), an ongoing project that designs unified and cross-linguistically valid grammatical representations including part-of-speech tags, morphological features, and dependency relations. The Persian UD is the converted version of the Uppsala Persian Dependency Treebank (UP...
متن کاملFeature Engineering in Persian Dependency Parser
Dependency parser is one of the most important fundamental tools in the natural language processing, which extracts structure of sentences and determines the relations between words based on the dependency grammar. The dependency parser is proper for free order languages, such as Persian. In this paper, data-driven dependency parser has been developed with the help of phrase-structure parser fo...
متن کامل